Interactive Visualization for Topic Model Curation
Authors
Abstract
Topic modeling methods can help analysts understand the content of a large text corpus, but the discovered topics often do not make clear sense to human analysts. Interactive topic modeling addresses this problem by allowing a human to steer the topic model curation process (generate, interpret, diagnose, and refine). However, humans have limited ability to work with the artifacts of computational topic models, since these artifacts are difficult to interpret and harvest. This paper explores the nature of such challenges and provides a visual analytic solution in the context of supporting political scientists in understanding the thematic content of online petition data. We use interactive topic modeling of the White House online petition data as a lens to bring up key points of discussion and to highlight the unsolved problems as well as the potential utility of visual analytics methods.
Related resources
Assessing the Preservation Condition of Large and Heterogeneous Electronic Records Collections with Visualization
As collections become larger in size, more complex in structure and increasingly diverse in composition, new approaches are needed to help curators assess digital files and make decisions about their long-term preservation. We present research on the use of interactive visualization to analyze file characterization information for the purpose of assessing the preservation condition of a vast co...
GoMapMan: integration, consolidation and visualization of plant gene annotations within the MapMan ontology
GoMapMan (http://www.gomapman.org) is an open web-accessible resource for gene functional annotations in the plant sciences. It was developed to facilitate improvement, consolidation and visualization of gene annotations across several plant species. GoMapMan is based on the MapMan ontology, organized in the form of a hierarchical tree of biological concepts, which describe gene functions. Curr...
Xenbase: a genomic, epigenomic and transcriptomic model organism database
Xenbase (www.xenbase.org) is an online resource for researchers utilizing Xenopus laevis and Xenopus tropicalis, and for biomedical scientists seeking access to data generated with these model systems. Content is aggregated from a variety of external resources and also generated by in-house curation of scientific literature and bioinformatic analyses. Over the past two years many new types of c...
Hiérarchie: Interactive Visualization for Hierarchical Topic Models
Existing algorithms for understanding large collections of documents often produce output that is nearly as difficult and time consuming to interpret as reading each of the documents themselves. Topic modeling is a text understanding algorithm that discovers the “topics” or themes within a collection of documents. Tools based on topic modeling become increasingly complex as the number of topics...
Concurrent Visualization of Relationships between Words and Topics in Topic Models
Analysis tools based on topic models are often used as a means to explore large amounts of unstructured data. Users often reason about the correctness of a model using relationships between words within the topics or topics within the model. We compute this useful contextual information as term co-occurrence and topic covariance and overlay it on top of standard topic model output via an intuit...
Publication date: 2018